Meta-learning beyond classification: A framework for information extraction from the Web
نویسندگان
چکیده
This paper proposes a meta-learning framework in the context of information extraction from the Web. The proposed framework relies on learning a meta-level classifier, based on the output of base-level information extraction systems. Such systems are typically trained to recognize relevant information within documents, i.e., streams of lexical units, which differs significantly from the task of classifying feature vectors that is commonly assumed for metalearning. The proposed framework was evaluated experimentally on the challenging task of training an information extraction system for multiple Web sites. Three well-known methods for training extraction systems were employed at the base level. A variety of classifiers were comparatively evaluated at the meta level. The extraction accuracy that was obtained demonstrated the effectiveness of the proposed framework of collaboration between base-level extraction systems and common classifiers at meta-level.
منابع مشابه
Presenting a method for extracting structured domain-dependent information from Farsi Web pages
Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...
متن کامل1 Learning Named-Entity Recognition Rules
This paper defines a new stacked generalization framework in the context of information extraction (IE) from online sources. The proposed setting removes the constraint of applying classifiers at the base-level. A set of IE systems are trained instead to identify relevant fragments within text documents, which differs significantly from the task of classifying candidate text fragments as releva...
متن کاملDeveloping a New Method in Object Based Classification to Updating Large Scale Maps with Emphasis on Building Feature
According to the cities expansion, updating urban maps for urban planning is important and its effectiveness is depend on the information extraction / change detection accuracy. Information extraction methods are divided into two groups, including Pixel-Based (PB) and Object-Based (OB). OB analysis has overcome the limitations of PB analysis (producing salt-pepper results and features with hole...
متن کاملAdaptive Information Analysis in Higher Education Institutes
Information integration plays an important role in academic environments since it provides a comprehensive view of education data and enables mangers to analyze and evaluate the effectiveness of education processes. However, the problem in the traditional information integration is the lack of personalization due to weak information resource or unavailability of analysis functionality. In this ...
متن کاملAdaptive Information Analysis in Higher Education Institutes
Information integration plays an important role in academic environments since it provides a comprehensive view of education data and enables mangers to analyze and evaluate the effectiveness of education processes. However, the problem in the traditional information integration is the lack of personalization due to weak information resource or unavailability of analysis functionality. In this ...
متن کامل